Building a modern data stack at Zip from Coalesce 2023

Moss Pauly, senior manager of data products at Zip, explains Zip's journey in building its data platform.

"At the end of the day, it's about making your data platform your own and making sure that you're solving the problems that you face and your business faces."

In this talk, Moss walks through the principles Zip’s data team used to make decisions when building out its stack, the challenges the team faced, and the solutions it arrived at.

Building a data platform requires context and understanding of company structure

Moss emphasizes the importance of understanding the company's structure and context when building a data platform. He says, "We're all building these data platforms to solve problems that we're facing individually or as a business. And as such, although there [are] general problems in this space, there's a lot that [is] specific to a vertical of businesses, or your specific business potentially."

He describes how Zip's data teams were divided by area of expertise (data analytics, data science, and data engineering) and how they worked in partnership with the wider business, including product, engineering, and operational teams. As he puts it, "We don't just work in isolation. We work in partnership with a broader business and our stakeholders."

Migration from one data platform to another is a journey that requires deep thinking

Moss explains Zip’s journey of migrating to a new data platform. He emphasizes the critical role of deep thinking in making the transition successful, stating, "We had a clear picture in our heads of how everything was going to interrelate, and I think that's paid dividends for us."

Moss stresses that deviating from standards and recommendations is worthwhile when it better serves the company's specific needs, even when that means more engineering up front. He notes, "Preemptive overengineering is sometimes the root of all evil. In some cases, it is worth overengineering at the start if you've got a very clear goal of where you're ending up."

"We looked at the standards and recommendations and thought ‘This is a great starting point for a data platform.’ But one of the catalysts for us in looking at this entire project and building out a data platform was that we wanted to be able to deal with high volume event data, which the standards and recommendations didn't cater for," he adds.

Respect for customer data is paramount in building a data platform

Moss highlights the importance of respecting customer data when building a data platform. He shares Zip’s goal of creating a solution that respects customer privacy, while still providing an open data platform. He states, "One of our goals when we set out building out our data platform was that we wanted to make our data platform as open as possible. We didn't want to have a situation where we…had to make decisions about who had access to the platform and who didn't because of the sensitive data that was present there."

He discusses the strategy they used, which involved separate roles for data engineering and model development, plus a dedicated role for those who needed to see personally identifiable information (PII). He affirms, "I've got to say this worked really, really well for us…the role structure has stayed the same, and it's been really easy to manage in dbt using macros around making sure that the right role owns the right tables."
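To make the idea of "the right role owns the right tables" concrete, here is a minimal sketch in Python rather than in dbt's own macro language. Everything in it is an assumption for illustration: the role names, the `Table` fields, the rule that tables containing PII are owned by the PII role, and the Snowflake-style `GRANT OWNERSHIP` syntax. The talk does not describe Zip's actual roles, rules, or warehouse commands.

```python
from dataclasses import dataclass

# Hypothetical role names; the talk does not name Zip's actual roles.
ENGINEERING_ROLE = "DATA_ENGINEER"
MODELING_ROLE = "MODEL_DEVELOPER"
PII_ROLE = "PII_VIEWER"


@dataclass(frozen=True)
class Table:
    name: str           # fully qualified table name
    layer: str          # "source" for ingested data, "model" for transformed data
    contains_pii: bool  # True if any column holds personally identifiable information


def owner_role(table: Table) -> str:
    """Pick the owning role under the assumed rules: PII first, then layer."""
    if table.contains_pii:
        return PII_ROLE
    if table.layer == "source":
        return ENGINEERING_ROLE
    return MODELING_ROLE


def ownership_statement(table: Table) -> str:
    """Render a Snowflake-style ownership grant (syntax varies by warehouse)."""
    return f"GRANT OWNERSHIP ON TABLE {table.name} TO ROLE {owner_role(table)}"


if __name__ == "__main__":
    tables = [
        Table("raw.events", "source", False),
        Table("analytics.orders", "model", False),
        Table("analytics.customers", "model", True),
    ]
    for t in tables:
        print(ownership_statement(t))
```

Keeping the rule in one function means the role structure can stay fixed while tables come and go, which matches the spirit of Moss's remark that the structure "has stayed the same."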

Insights surfaced

  • Having a deep understanding of tools and systems can provide elegant solutions to complex problems
  • It's crucial to balance cost efficiency with cost observability, especially when scaling data operations
  • The adoption of new tools should be balanced with governance to ensure long-term scalability
  • Deviating from standards and recommendations can sometimes yield better results, especially when the standard approach doesn't align with business goals or needs
  • Iterative and incremental development can be more beneficial than preemptive overengineering